Insects are the most important global pollinators of crops and play a key role in maintaining the sustainability of natural ecosystems. Insect pollination monitoring and management are therefore essential for improving crop production and food security. Computer-vision-facilitated pollinator monitoring can intensify data collection beyond what is feasible using manual approaches. The new data it generates may provide a detailed understanding of insect distributions and facilitate fine-grained analysis sufficient to predict their pollination efficacy and underpin precision pollination. Current computer-vision-facilitated insect tracking in complex outdoor environments is restricted in spatial coverage and often constrained to a single insect species. This limits its relevance to agriculture. Therefore, in this article we introduce a novel system to facilitate markerless data capture for insect counting, insect motion tracking, behaviour analysis and pollination prediction across large agricultural areas. Our system comprises edge-computing multi-point video recording and offline automated multispecies insect counting, tracking and behavioural analysis. We implement and test our system on a commercial berry farm to demonstrate its capabilities. Our system successfully tracked four insect varieties at nine monitoring stations within polytunnels, obtaining an F-score above 0.8 for each variety. The system enabled calculation of key metrics to assess the relative pollination impact of each insect variety. With this technological advancement, detailed, ongoing data collection for precision pollination becomes achievable. This is important to inform growers and apiarists managing crop pollination, as it allows data-driven decisions to be made to improve food production and food security.
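For readers unfamiliar with the F-score quoted above, the following is a minimal sketch of how it is conventionally computed from detection counts. It assumes the balanced F1 variant (the abstract does not say which F-score is used), and the function name and the counts in the usage line are hypothetical, not taken from the paper.

```python
def f_score(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Balanced F-score (F1): harmonic mean of precision and recall.

    Assumption: F1 is used here for illustration; the abstract does not
    specify which F-score variant was reported.
    """
    if true_pos == 0:
        # No correct detections: precision and recall are both zero
        # (or undefined), so report 0.0 by convention.
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for one insect variety (illustrative only).
example = f_score(true_pos=8, false_pos=2, false_neg=2)
```

With these made-up counts, precision and recall are both 0.8, so the score is 0.8 as well; missed insects and false tracks both pull the score down.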
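Developing an effective automatic classifier that separates genuine sources from artifacts is essential for transient follow-up in wide-field optical surveys. Identifying transient detections among the subtraction artifacts left by the image differencing process is a key step in such classifiers, known as the real-bogus classification problem. We apply a self-supervised machine learning model, the deep-embedded self-organizing map (DESOM), to this "real-bogus" classification problem. DESOM combines an autoencoder and a self-organizing map to perform clustering, distinguishing real from bogus detections based on their dimensionality-reduced representations. We use 32x32 normalized detection thumbnails as the input to DESOM. We demonstrate different model training approaches and find that our best DESOM classifier shows a missed detection rate of 6.6% with a false positive rate of 1.5%. DESOM offers a more nuanced way to fine-tune the decision boundary for identifying likely real detections when used in combination with other types of classifiers, for example those built on neural networks or decision trees. We also discuss other potential uses of DESOM and its limitations.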
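Owing to their inference, data representation and reconstruction properties, Variational Autoencoders (VAEs) have been successfully used in continual learning classification tasks. However, their ability to generate images with specifications corresponding to the classes and databases learned during Continual Learning (CL) is not well understood, and catastrophic forgetting remains a significant challenge. In this paper, we first analyze the forgetting behaviour of VAEs by developing a new theoretical framework that formulates CL as a dynamic optimal transport problem. This framework proves approximate bounds on the data likelihood without requiring task information and explains how prior knowledge is lost during the training process. We then propose a novel memory buffering approach, the Online Cooperative Memorization (OCM) framework, which consists of a Short-Term Memory (STM) that continually stores recent samples to provide future information for the model, and a Long-Term Memory (LTM) that aims to preserve a wide diversity of samples. The proposed OCM transfers certain samples from STM to LTM according to an information diversity selection criterion, without requiring any supervised signals. The OCM framework is then combined with a dynamic VAE expansion mixture network to further enhance its performance.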
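Recently, continual learning (CL) has attracted significant interest because it enables deep learning models to acquire new knowledge without forgetting previously learned information. However, most existing works require knowledge of the task identities and boundaries, which is unrealistic in practical settings. In this paper, we address a more challenging and realistic setting in CL, namely Task-Free Continual Learning (TFCL), in which a model is trained on non-stationary data streams with no explicit task information. To address TFCL, we introduce an evolved mixture model whose network architecture is dynamically expanded to adapt to shifts in the data distribution. We implement this expansion mechanism by using the Hilbert-Schmidt Independence Criterion (HSIC) to evaluate the probability distance between the knowledge stored in each mixture model component and the current memory buffer. We further introduce two simple dropout mechanisms that selectively remove stored examples to avoid memory overload while preserving memory diversity. Empirical results demonstrate that the proposed approach achieves excellent performance.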
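As background for the HSIC mentioned above, here is a minimal sketch of the standard biased empirical HSIC estimator on paired samples, trace(K H L H) / (n - 1)^2, with Gaussian kernels. It illustrates the criterion itself and is not the paper's expansion procedure; the kernel choice, the bandwidth `sigma` and all function names are assumptions made for illustration.

```python
import math


def rbf_gram(samples, sigma=1.0):
    """Gram matrix of a Gaussian (RBF) kernel over a list of vectors."""
    def kernel(a, b):
        sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
        return math.exp(-sq_dist / (2.0 * sigma ** 2))
    return [[kernel(a, b) for b in samples] for a in samples]


def center(gram):
    """Double-center a symmetric Gram matrix: H K H, H = I - (1/n) 11^T."""
    n = len(gram)
    row_means = [sum(row) / n for row in gram]
    total_mean = sum(row_means) / n
    return [[gram[i][j] - row_means[i] - row_means[j] + total_mean
             for j in range(n)] for i in range(n)]


def hsic(xs, ys, sigma=1.0):
    """Biased empirical HSIC between paired samples xs[i] and ys[i]."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("xs and ys must be paired and contain >= 2 samples")
    k_centered = center(rbf_gram(xs, sigma))
    l_gram = rbf_gram(ys, sigma)
    # trace(K H L H) equals the elementwise sum of (H K H) * L
    # because both matrices are symmetric.
    total = sum(k_centered[i][j] * l_gram[i][j]
                for i in range(n) for j in range(n))
    return total / (n - 1) ** 2


# Hypothetical toy data: ys depends on xs in one case, is constant in the other.
xs = [[0.0], [1.0], [2.0], [3.0]]
dependent = hsic(xs, xs)
constant = hsic(xs, [[5.0]] * 4)
```

A larger value indicates stronger statistical dependence between the two sample sets; a constant second variable yields a value of zero up to floating-point error.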
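Variational Autoencoders (VAEs) suffer from degraded performance when learning several successive tasks. This is caused by catastrophic forgetting. To address this knowledge loss, VAEs use either Generative Replay (GR) mechanisms or Expanding Network Architectures (ENA). In this paper, we study the forgetting behaviour of VAEs by deriving an upper bound on the negative marginal log-likelihood. This theoretical analysis provides new insights into how VAEs forget previously learned knowledge during lifelong learning. The analysis indicates that the best performance is achieved when considering model mixtures under the ENA framework, where there is no restriction on the number of components. However, an ENA-based approach may require an excessive number of parameters. This motivates us to propose a novel Dynamic Expansion Graph Model (DEGM). DEGM expands its architecture according to the novelty associated with each new database, compared with the information the network has already learned from previous tasks. DEGM training optimizes knowledge structuring, characterizing the joint probabilistic representations corresponding to past and more recently learned tasks. We demonstrate that DEGM guarantees optimal performance for each task while also minimizing the required number of parameters. Supplementary materials (SM) and source code are available at https://github.com/dtuzi123/expansion-graph-model.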
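In video data, busy motion details from moving regions are conveyed within a specific frequency bandwidth in the frequency domain. Meanwhile, the remaining frequencies of video data encode quiet information with substantial redundancy, which leads to low processing efficiency in existing video models that take raw RGB frames as input. In this paper, we consider allocating more computation to processing the important busy information and less computation to the quiet information. We design a trainable Motion Band-Pass Module (MBPM) for separating busy information from quiet information in raw video data. By embedding the MBPM into a two-pathway CNN architecture, we define a Busy-Quiet Net (BQN). The efficiency of BQN comes from avoiding redundancy in the feature space processed by the two pathways: one operates on low-resolution quiet features, while the other processes busy features. The proposed BQN outperforms many recent video processing models on the Something-Something V1, Kinetics400, UCF101 and HMDB51 datasets.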
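In this paper, we propose a new video representation learning method, named Temporal Squeeze (TS) pooling, which extracts the essential movement information from a long sequence of video frames and maps it into a small set of images, named Squeezed Images. By embedding Temporal Squeeze pooling as a layer into off-the-shelf Convolutional Neural Networks (CNNs), we design a new video classification model, named Temporal Squeeze Network (TESNet). The resulting Squeezed Images contain the essential movement information from the video frames, corresponding to the optimization of the video classification task. We evaluate our architecture on two video classification benchmarks and compare the results with the state of the art.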
While the capabilities of autonomous systems have been steadily improving in recent years, these systems still struggle to rapidly explore previously unknown environments without the aid of GPS-assisted navigation. The DARPA Subterranean (SubT) Challenge aimed to fast-track the development of autonomous exploration systems by evaluating their performance in real-world underground search-and-rescue scenarios. Subterranean environments present a plethora of challenges for robotic systems, such as limited communications, complex topology, visually degraded sensing, and harsh terrain. The presented solution enables long-term autonomy with minimal human supervision by combining a powerful and independent single-agent autonomy stack with higher-level mission management operating over a flexible mesh network. The autonomy suite deployed on quadruped and wheeled robots was fully independent, freeing the human supervisor to loosely oversee the mission and make high-impact strategic decisions. We also discuss lessons learned from fielding our system at the SubT Final Event, relating to vehicle versatility, system adaptability, and re-configurable communications.
We address the challenge of building domain-specific knowledge models for industrial use cases, where labelled data and taxonomic information are initially scarce. Our focus is on inductive link prediction models as a basis for practical tools that support knowledge engineers with exploring text collections and discovering and linking new (so-called open-world) entities to the knowledge graph. We argue that, although neural approaches to text mining have yielded impressive results in the past years, current benchmarks do not properly reflect the typical challenges encountered in the industrial wild. Therefore, our first contribution is an open benchmark coined IRT2 (inductive reasoning with text) that (1) covers knowledge graphs of varying sizes (including very small ones), (2) comes with incidental, low-quality text mentions, and (3) includes not only triple completion but also ranking, which is relevant for supporting experts with discovery tasks. We investigate two neural models for inductive link prediction, one based on end-to-end learning and one that learns from the knowledge graph and text data in separate steps. These models compete with a strong bag-of-words baseline. The results show a significant advance in performance for the neural approaches as soon as the available graph data decreases for linking. For ranking, the results are promising, and the neural approaches outperform the sparse retriever by a wide margin.
There are multiple scales of abstraction from which we can describe the same image, depending on whether we are focusing on fine-grained details or a more global attribute of the image. In brain mapping, learning to automatically parse images to build representations of both small-scale features (e.g., the presence of cells or blood vessels) and global properties of an image (e.g., which brain region the image comes from) is a crucial and open challenge. However, most existing datasets and benchmarks for neuroanatomy consider only a single downstream task at a time. To bridge this gap, we introduce a new dataset, annotations, and multiple downstream tasks that provide diverse ways to read out information about brain structure and architecture from the same image. Our multi-task neuroimaging benchmark (MTNeuro) is built on volumetric, micrometer-resolution X-ray microtomography images spanning a large thalamocortical section of mouse brain, encompassing multiple cortical and subcortical regions. We generated a number of different prediction challenges and evaluated several supervised and self-supervised models for brain-region prediction and pixel-level semantic segmentation of microstructures. Our experiments not only highlight the rich heterogeneity of this dataset, but also provide insights into how self-supervised approaches can be used to learn representations that capture multiple attributes of a single image and perform well on a variety of downstream tasks. Datasets, code, and pre-trained baseline models are provided at: https://mtneuro.github.io/ .